---
title: "Schema"
description: "Base class for creating schemas to represent structured data in Superlinked's embedding space"
---

## Schema

```python
Schema()
```

Inherit from this class to define a schema that represents your structured data.

Schemas translate to entities in the embedding space that you can search by or search for.

### Ancestors (in MRO)

- superlinked.framework.common.schema.id_schema_object.IdSchemaObject
- abc.ABC

## Usage

### Basic Schema Definition

Define a simple schema for your data structure:

```python
from superlinked import Schema

class ProductSchema(Schema):
    id: str
    name: str
    description: str
    price: float
    category: str
    in_stock: bool

# Create an instance to use in your application
product_schema = ProductSchema()
```

### Schema with Complex Types

Use more complex type annotations for richer data structures:

```python
from datetime import datetime
from typing import List, Optional
from superlinked import Schema

class UserSchema(Schema):
    user_id: str
    email: str
    name: str
    age: Optional[int]
    tags: List[str]
    created_at: datetime
    is_active: bool

user_schema = UserSchema()
```

### Event-Based Schema

Create schemas for time-series or event data:

```python
from datetime import datetime
from typing import Optional

from superlinked import Schema

class InteractionSchema(Schema):
    user_id: str
    item_id: str
    interaction_type: str
    timestamp: datetime
    rating: Optional[float]

interaction_schema = InteractionSchema()
```

## Schema Integration

### With Spaces

Use schema fields as the inputs to vector spaces:

```python
from superlinked import TextSimilaritySpace, CategoricalSimilaritySpace

# Create spaces based on schema fields
text_space = TextSimilaritySpace(
    text=product_schema.description,
    model="sentence-transformers/all-MiniLM-L6-v2"
)

category_space = CategoricalSimilaritySpace(
    category_input=product_schema.category,
    categories=["electronics", "clothing", "books"]
)
```

### With Data Sources

Connect schemas to data sources:

```python
from superlinked import InMemorySource

# Create a data source for the schema
source = InMemorySource(product_schema)
```

### With Indexes

Organize schemas in indexes for querying:

```python
from superlinked import Index

# Create an index combining multiple spaces
product_index = Index([text_space, category_space])
```

## Schema Properties

Once your class inherits from `Schema`, it gains several important capabilities:

### Entity Representation

- **Embedding Space Entities**: Each schema instance represents an entity type in the vector space
- **Searchable Units**: You can search for entities of this schema type or use them as search criteria
- **Type Safety**: The schema ensures data consistency and type validation

### Field Access

- **Schema Fields**: Access individual fields for use in spaces and queries
- **Type Information**: Maintain type safety throughout the pipeline
- **Validation**: Automatic validation of data against the schema structure
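
The validation mentioned above happens inside the framework, and this page does not show how. As a rough illustration of what checking a record against type annotations involves, here is a plain-Python sketch. `ProductFields` and `validate_record` are hypothetical names used only for this example. They are not part of Superlinked, and the stand-in class deliberately does not inherit from `Schema`.

```python
from typing import Optional, Union, get_args, get_origin, get_type_hints


class ProductFields:
    """Hypothetical plain-Python stand-in mirroring some ProductSchema fields."""
    id: str
    name: str
    price: float
    in_stock: bool
    discount: Optional[float]


def validate_record(fields_cls, record):
    """Return a list of problems found in `record`, in annotation order.

    Handles only plain classes and Optional[...] of plain classes.
    """
    problems = []
    for field, annotation in get_type_hints(fields_cls).items():
        # Optional[X] is Union[X, None]; unpack it into its member types.
        if get_origin(annotation) is Union:
            allowed = get_args(annotation)
        else:
            allowed = (annotation,)
        value = record.get(field)
        if value is None:
            # A missing or None value is acceptable only for optional fields.
            if type(None) not in allowed:
                problems.append(f"missing required field: {field}")
            continue
        concrete = tuple(t for t in allowed if t is not type(None))
        if not isinstance(value, concrete):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    return problems


good = {"id": "p1", "name": "Lamp", "price": 19.5, "in_stock": True}
bad = {"id": "p1", "price": "cheap", "in_stock": True}

print(validate_record(ProductFields, good))
print(validate_record(ProductFields, bad))
```

The sketch treats a missing key and an explicit `None` the same way, which is a simplification chosen for brevity rather than a statement about how the framework behaves.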

## Best Practices

<Tip>
  **Clear Naming**: Use descriptive class and field names that clearly represent
  your data domain. This improves code readability and makes debugging easier.
</Tip>

<Tip>
  **Type Annotations**: Always include proper type annotations for all fields.
  This enables better validation and IDE support.
</Tip>

<Warning>
  **Required Fields**: Mark optional fields explicitly with `Optional[]` or
  `| None`. All other fields are considered required.
</Warning>

<Note>
  **Schema Evolution**: Changes to schema definitions may require rebuilding
  indexes and reprocessing data. Plan schema changes carefully in production
  environments.
</Note>
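
When planning a schema change, it can help to list exactly which fields differ between the old and new definitions. The following is a minimal sketch in plain Python. `ProductV1`, `ProductV2`, and `diff_fields` are hypothetical names, and the classes are ordinary annotated classes rather than `Schema` subclasses. The sketch only reports differences; whether a given difference requires reindexing is something to confirm against the framework's own documentation.

```python
from typing import get_type_hints


class ProductV1:
    """Hypothetical earlier version of a product definition."""
    id: str
    name: str
    price: float


class ProductV2:
    """Hypothetical later version of the same definition."""
    id: str
    name: str
    price: int       # type changed from float
    category: str    # field added


def diff_fields(old_cls, new_cls):
    """Compare class annotations and report added, removed, and retyped fields."""
    old, new = get_type_hints(old_cls), get_type_hints(new_cls)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "retyped": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


changes = diff_fields(ProductV1, ProductV2)
print(changes)
```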

## Common Patterns

### Multiple Schema Relationships

Define related schemas for complex data models:

```python
from superlinked import Schema

class AuthorSchema(Schema):
    author_id: str
    name: str
    biography: str

class BookSchema(Schema):
    book_id: str
    title: str
    author_id: str  # References AuthorSchema
    isbn: str
    publication_year: int

author_schema = AuthorSchema()
book_schema = BookSchema()
```
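
In the example above, the link from `BookSchema.author_id` to `AuthorSchema` is expressed only by a comment, and this page does not say whether the framework checks it. One option is to check the reference yourself before ingesting data. The sketch below assumes records are plain dictionaries keyed by the schema field names; `find_dangling_author_ids` is a hypothetical helper, not a Superlinked function.

```python
def find_dangling_author_ids(books, authors):
    """Return the sorted author_ids used by books that match no author record."""
    known = {author["author_id"] for author in authors}
    return sorted({book["author_id"] for book in books if book["author_id"] not in known})


# Illustrative sample data, invented for this example.
authors = [
    {"author_id": "a1", "name": "Author One", "biography": "Example biography."},
]
books = [
    {"book_id": "b1", "title": "First Book", "author_id": "a1",
     "isbn": "isbn-1", "publication_year": 2001},
    {"book_id": "b2", "title": "Second Book", "author_id": "a9",
     "isbn": "isbn-2", "publication_year": 2005},
]

dangling = find_dangling_author_ids(books, authors)
print(dangling)
```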

### Hierarchical Data

Model hierarchical data with parent references between schemas:

```python
from typing import Optional

from superlinked import Schema

class CategorySchema(Schema):
    category_id: str
    name: str
    parent_category_id: Optional[str]

class ProductSchema(Schema):
    product_id: str
    name: str
    category_id: str  # References CategorySchema
    subcategory_id: Optional[str]
```
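
Because each category stores only its parent's id, reconstructing the full path of a category means following `parent_category_id` until it is `None`. The following is a minimal sketch over plain dictionaries shaped like `CategorySchema` records. `category_path` and the sample data are hypothetical and independent of Superlinked.

```python
def category_path(category_id, categories_by_id):
    """Return category names from the root down to `category_id`."""
    path = []
    seen = set()
    current = category_id
    while current is not None:
        # Guard against malformed data where a category is its own ancestor.
        if current in seen:
            raise ValueError(f"cycle detected at category {current!r}")
        seen.add(current)
        category = categories_by_id[current]
        path.append(category["name"])
        current = category["parent_category_id"]
    return list(reversed(path))


# Illustrative sample data, invented for this example.
categories_by_id = {
    "c1": {"category_id": "c1", "name": "Electronics", "parent_category_id": None},
    "c2": {"category_id": "c2", "name": "Audio", "parent_category_id": "c1"},
    "c3": {"category_id": "c3", "name": "Headphones", "parent_category_id": "c2"},
}

path = category_path("c3", categories_by_id)
print(path)
```

A flattened path like this could then be stored as an extra string field on the product record if you want to embed or filter on the full hierarchy. That is a design choice for your data model, not something this page prescribes.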
